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An XML Document Editor 
Field of Invention 

The present invention relates to an XML document editor, especially to an XML 
document editor that generates documents complying with a given DTD or schema. 
5 Background of the Invention 

Extensible markup language (XML) is a new generic representation for data and 
documents. It has emerged as a standard for information exchange over the network. 
Moreover, many application domains have developed DTD (Document Type 
Definition) or schema-based XML vocabularies for their data and documents to be 

10 coded in XML and processed by standard software packages. XML thus has great 
impact on all these application domains. 

Fundamental to XML applications is the creation and editing of XML documents. 
This is evidenced by the large number of commercial and public-domain XML editors 
released so far. An XML document editor (hereinafter referred to as "XML editor") 

15 allows its user to produce either well-formed and/or syntax valid XML documents. 
Such an XML editor typically maintains a well-formed working document and 
provides validity checking upon the user's request. The user has all the freedom to edit 
the document. To ensure the document to be valid, the user must check for validity and 
correct the syntactic violations in the document reported by the validity checker. This 

20 requires the user's knowledge about the DTD or schema of the document. Another 
approach to relieve the user from syntactic concerns is to provide extensive 
customization specific to XML vocabularies. However, this requires the intervention of 
a technical expert. 

XML editors in use today typically support the so-called context-sensitive editing. 
25 While a user is editing a document, the system provides guidance or hints regarding 
what the user can do next so as to produce a valid document eventually. This policy is 



helpful. However, the guidance or hints these XML editors provide are typically too 
loose to guarantee the validity of the resulting document. That becomes a major reason 
why these XML editors need a validity checker. 

Current XML editors appear to differ more in their user interfaces than their 
underlying editing methodologies and, thus, may be classified into 4 types according to 
their user interface. They are: user interfaces based on text views, tree views, 
presentation views and forms. Some systems support multiple types of user interfaces. 
A text view XML editor allows the user to edit tags as in a text editor. A tree view editor 
displays the hierarchical structure of an XML document as a tree and provides 
operations for inserting and deleting tree nodes. A presentation view applies some 
kinds of transformations, including XSLT, CSS and proprietary ones, to an XML 
document in order to present a "WYSIWYG" user interface to its user. A presentation 
view specific to a vocabulary typically requires customization and support by technical 
experts. 

A form-based XML editor is essentially a form generator based on a data model 
specified by an XML schema (or DTD). Current form-based XML editors are not 
robust enough to handle general XML data, not to mention their ability to avoid 
syntactic violations upon data entry. 

These XML editors do not enforce syntactic constraints strictly during document 
construction and provide validity checking as warning/hints for correcting syntactic 
violations among other typically inaccurate context-sensitive guidance and hints. These 
XML editors allow users to create and edit both well-formed and valid XML 
documents in a single editing mode. 

Objectives of the Invention 

An objective of this invention is to provide an XML document editor dedicated to 
generating XML documents complying with a given DTD or schema. 
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Another objective of this invention is to provide an XML document editor such 
that syntactic violations against applicable DTD or schema may be avoided. 

Another objective of this invention is to provide an XML document editor that is 
able to generate accurate guidance and hints to users in the process of document 
5 creation. 

Another objective of this invention is to provide an XML document editor to 
simplify users' document creation or edition process. 

Another objective of this invention is to provide an XML document editor with 
which an user does not need to be a technical expert to create or edit an XML document. 

10 Another objective of this invention is to provide an XML document editor that is 

able to support user interface of various types. 

Another objective of this invention is to provide an XML document editor that is 
able to support a form-based user interface. 

Summary of the Invention 

15 According to the present invention, a DTD-compliant XML document editor is 

disclosed. The XML document editor generates hints for required elements and 
required element slots automatically in the process of document creation and edition so 
as to guide the user to produce a valid document, while syntactic violations are avoided 
at the same time. The editor also suggests optional elements that may be added into the 

20 document to be edited. The user requires no knowledge about XML and DTD to edit 
DTD-compliant XML documents. The editing process is user-interface-neutral, being 
able to support user interfaces based on tree views, presentation views and forms. By 
combining the DTD-compliant editing process with the simplicity of forms, a simple 
XML document editor with forms as its user interface is developed. 

25 These and other objectives and advantages of this invention may be clearly 

understood from the detailed description by referring to the following drawings. 



Brief Description of the Drawings 

In the drawings, 

Fig. 1 shows the Glushkov automaton for regular expression E, E = ((a \ b)*, c). 
Fig. 2 shows the Glushkov automaton for regular expression E= (a, b, c* (d \ 
5 e+))* . 

Fig. 3 illustrates the top-level form of the user interface of an embodiment of the 
XML document editor of this invention. 

Fig. 4 illustrates the screen display of optional element slots of the user interface of 
the XML document editor of this invention. 
10 Fig. 5 illustrates the nest form of the user interface of the XML document editor of 

this invention. 

Detailed Description of the Invention 
The purpose of this invention is to provide a DTD-compliant XML document 
editor that has the following unique features: 
15 First, the editor allows the user to edit XML documents without the violation of 

syntactic constraints given by the applicable DTD (document type definition) or 
schema, whereby the user does not need to correct any syntactic violations. 

Secondly, the editor generates hints for elements and element slots (placeholders 
for the user to add elements) automatically when it detects that they are required to 
20 make the current document valid. 

Third, the editor generates optional element slots upon the user's request. Elements 
and element slots are generated in response to the user's addition of elements. 

Last but not least, the DTD-compliant document editor does not require the user's 
knowledge about XML. The user only needs to know the application-specific meaning 
25 of elements and attributes in an XML vocabulary. 
Glushkov Automata 



An XML document has a nested multi-level hierarchical structure. For the purpose 
of syntactic validity, it is possible to consider the syntactic constraint imposed on each 
element and its child elements separately. The content type of an element is defined by 
a regular expression (to be explained in details hereinafter), which determines the valid 
5 sequence of occurrences of its child element types. Here, the notation of DTD is 
adopted to express regular expressions. 

Although it is not intended to limit the scope of this invention, the XML editor of 
this invention may be supported by the theory of automata. A regular expression E over 
a finite alphabet of symbols I ={xi, x 2 , x n } is simple if each symbol Xj appears in E 
10 only once. The language L specified by E can be recognized by a deterministic finite 
automaton (DFA) G, known as the Glushkov automaton, defined as follows: 

(1) Every symbol of S is a state of G. G has two additional states s and f as its start 
and final states, respectively. (If L contains the empty string, s is also a final 
state.) 

15 (2) The transition function 5(x,, Xj) = Xj for any x\ G S and Xj G follow(xj), i.e. Xj 

immediately follows Xj in some string in L. 
5(s, Xj) = Xj for any Xj G first(E), i.e. Xj is the first symbol in some strings in L. 
5(x i? $) = f for any Xj G last(E), i.e. Xj is the last symbol in some strings in L, where 
$ is a special end symbol appended to every string. 
20 Note that the functions first(E), last(E) and follow(xi) can be computed easily by 

traversing the tree structure of E once. Take the simple expression E = ((a | b) * c) as an 
example. We have first(E) = {a,b,c}, last(E) = {cj, follow(a) = follow(b) = {a,b,c}, and 
follow(c) = {}. The Glushkov automaton G for E is as shown in Figure 1 . 

Edges in the Glushkov automaton G are of several types: If states El and E2 are 
25 two subexpressions of E in sequence, i.e. (El, E2) is a subexpression of E, then G 
contains a sequence edge (u,v) for every u G last(El) and every v G first(E2). (A regular 



expression E has a natural tree representation. A subexpression of E corresponds to a 
subtree of the tree representation of E.) Sequence edges and edges from the start state s 
and edges to the final state f are referred to as forward edges collectively. If El* is a 
subexpression of E, then G contains an iteration edge (u,v) for every u G last(El) and 
5 every v G first(El). In general, an edge may be a sequence edge as well as an iteration 
edge. An iteration edge that is not a sequence edge is referred to as a backward edge. 
For the DFA G in Figure 1, edges (a,c) and (b,c) are sequence edges as well as forward 
edges. Edges (s,a), (s,b), (s,c) and (cj) are forward edges. Edges (a,a) 9 (a,b), (b,a), and 
(b,b) are iteration edges as well as backward edges. 

10 For a subexpression El of E, let A(E1) denote the set of symbols in Z that El 

includes. For instance, if El= (a \ b)*, then A(E1)= {a,b}. Let reachable(u) denote the 
set of states in G reachable from state u, i.e. reachable(u) = {z | there exists a path in G 
from u to z}. A forward path is a path only consisting of forward edges. Let 
f-reachable(u) denote the set of states in G reachable from u through forward edges, i.e. 

15 f-reachable(u) = {z | there exists a forward path in G from u to z }. With these 
definitions, we can make the following observations. 
Lemma 1 . The forward edges in G form no cycles. 

Proof: We can label all symbols in E such that the labels are in increasing order 
from left to right. The forward edges are always from symbols at lower orders to 
20 symbols at higher orders. Thus they cannot form cycles. 

Lemma 2. Let El be a subexpression of E. For any x C AfEl), there exists some u 
C firstfEn and v C last(El) such that x C f-reachable(u) and v C f-reachable(x). 

Proof: The lemma is trivial if El is a symbol. It is sufficient to prove the lemma on 
the basis that the child subexpressions of El satisfy the lemma. There are three cases: 
25 (1) El is an iteration, i.e. El= E2* for some subexpression E2, which satisfies the 

lemma. Since A(E1)=A(E2), first(El)= first(E2), and last(El)= last(E2), El satisfies 



the lemma as E2 does. 

(2) El is a choice, i.e. El= (E2 | E3), where E2 and E3 satisfy the lemma. For any x 
G A(E1), we have either x G A(E2) or x € A(E3), say x G A(E2). (The case for x G 
A(E3) is symmetric.) Since E2 satisfies the lemma, there exists some u G first(E2) and v 

5 G last(E2) such that x G f-reachable(u) and v G f-reachable(x). Since first(El)= first(E2) 
U first(E3) a first(E2) and last(E 1 )= last(E2) U last(E3) 3 last(E2), we have proven the 
lemma for EL 

(3) El is a sequence, i.e. El= (E2, E3), where E2 and E3 satisfy the lemma. For 
any x G A(E1), we have either x G A(E2) or x G A(E3), say x G A(E2). (The case for x 

10 G A(E3) is similar.) Since E2 satisfies the lemma, there exists some u G first(E2) and z 
G last(E2) such that x G f-reachable(u) and z G f-reachable(x). Since first(E2) c: 
first(El), we have u G first(El). Since E3 satisfies the lemma, there exists some w G 
first(E3) and v G last(E3) c last(El) such that v G f-reachable(w). Since El= (E2, E3), 
there is an edge from z to w. We have found a forward path from x to z, w, and finally v 

1 5 G last(E 1 ), which completes the proof for the lemma. 

Lemma 3. Let El* be a subexpression of E. Then for any two symbols u and v in 
A(E1\ we have v G reachable(u) and u G reachable(v). 

Proof: From Lemma 2, there exists xl, x2 G first(El) * and zl, z2 G last(El) * 
such that u G f-reachable(xl), zl G f-reachable(u), v G f-reachable(x2) and z2 G 

20 f-reachable(v). On the other hand, there exists a iteration edge from z2 to xl and a 
iteration edge from zl to x2. We have found a cycle connecting xl , u, zl, x2, v, z2, and 
back to xl . This completes the proof. 

The Glushkov automata above are for simple regular expressions. For a general 
regular expression E' over a finite alphabet of symbols S'={yi, yi, y m } where a 

25 symbol may appear 2 or more times in the expression, one can map E' to a simple 
regular expression E over an alphabet E ={xi, x 2 , . . . 5 x n } by renaming each occurrence 
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of symbol yj C E' as a distinct symbol Xj € S. Let origin(Xj) = yi denote the original 
symbol yi that Xj represents. Let G be the Glushkov automaton constructed above for E. 
One can construct the Glushkov automaton G' for E' from G by replacing all labels Xj 
on edges in G by origin(xj). 
5 For instance, for the regular expression E' = ((a \ b)* a), E' can be mapped to the 

simple regular expression E = ((a \ b) * c) by renaming the second occurrence of a in E' 
to c. The Glushkov automaton G' for E' can then be constructed from the Glushkov 
automaton G shown in Figure 1 by replacing the label c on all edges in G by label a. The 
Glushkov automata G and G' are so similar that one can do all computations on G and 

10 map the results to G' for a general regular expression E' at the end. 
DTD-Compliant XML Editor 

Let E' be a regular expression over the alphabet of symbols £'= {yi, y2, . y m } 
and E its associated simple regular expression over alphabet S= {xi, X2, . . x n }. Also let 
G and G' be the Glushkov automata for E and E', respectively, as constructed above. 

15 Assume that E' is used as the content model of an element type. It is apparent that a 
valid document corresponds to a path in G from s to f, and vice versa. In other words, if 
a path in G from s to f is found, a valid document may be established. 

An XML editor with the goal of producing a "valid document" at the end cannot 
guarantee to produce a valid document during the process of the creation and edition of 

20 the document. However, the XML editor of this invention produces "DTD-compliant" 
documents during their creation and edition at all times. In a DTD-compliant document, 
the child elements of a parent element form a subsequence of a valid "child element 
sequence". Take the regular expression (a?, b, (c, d)+, e) as an example. The string, 
acdde, is not valid but is DTD-compliant since it is a subsequence of the string, abcdcde, 

25 which is valid. 

Using a DTD-compliant editor, a user starts with an empty document and makes 



element additions under the control of the system so that the working document is 
always DTD-compliant. A user can delete an arbitrary element from the working 
document; the resulting document is still DTD-compliant apparently. A user iterates the 
process of element addition and deletion so as to complete a desirable valid XML 
5 document eventually. A DTD-compliant document corresponds to a sequence {zi, 
Z2 ? . z p } of states in G where Zi=s, z p =f, and z l+ \ G reachable(zj), l<i<p-l. The 
document seen by the user is the sequence {origin(z2), origin(z 3 ), ., origin(z p .i)}. The 
main issue is, when the user intents to insert an element between a pair of consecutive 
states (elements), say z\ and z\+\, how to compute an appropriate set of candidate states 

10 (elements) for the user to select from so that the insertion of the selected element results 
in a DTD-compliant document. 

As a result, the major function of the present invention is to determine how a 
DTD-compliant element may be inserted between a pair of consecutive states during 
the edition of the XML document. In making such a determination, it does not mean 

1 5 that the XML editor of this invention checks all the elements of the current document to 
determine whether all of them are DTD-compliant or that, during or after the edition of 
the document, all elements of the document comply with the applicable DTD. 
Candidate State Set 

Consider the candidate state set C for an element slot to be inserted between two 
20 states u and v where v C reachable(u). A candidate state set is a representation of the 
"possible" paths for connecting u to v. A necessary condition for a state z between u and 
v in C is to satisfy z € reachable(u) and v C reachable(z). However, this condition is not 
sufficient. Consider the Glushkov automaton G for the regular expression E= (a, b, c* 
(d | e+)) * as shown in Figure 2. Due to the outer iteration of E, every state is reachable 
25 from every other state (except s and f). If states in C just had to satisfy the necessary 
condition, C would contain all states in any case. Such a necessary condition is too 
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loose to give accurate suggestions. The concept of "minimal" candidate state set that 
does not involve cycles and detours (e.g. unnecessary backward edges) in general is 
thus adopted in this invention. The minimal candidate state set presumes a "direct path" 
or "direct paths" from u to v. Cycles are allowed in a candidate state set only in 
5 restricted cases. The following lemma characterizes "direct" paths from u to v. 

Lemma 4. Let u and v be states in G for which v € reachable(u). Assume v g 
f-reachable(u). Then there exists a subexpression El* of E to include u and v such that 
w € f-reachable(u) for some w € last(El) and v € f-reachable(z) for some z C first(El). 
This establishes an acyclic path P connecting u to v. (If the subexpression El * is chosen 

10 to be the smallest subexpression to include u and v, such a path P is referred to as a 
minimal backward path.) 

Proof: Let E2 be the smallest subexpression of E to include both u and v. Assume 
to the contrary that E2 and all its ancestors are not iteration subexpressions, i.e. they are 
sequence or choice subexpressions. Consider any ancestor E3 of E2. E3 may only 

15 induce forward edges from states in its left subexpression to states in its right 
subexpression. However, both u and v are covered by one of its subexpression. Thus the 
forward edges induced by E3 have nothing to do with whether v is (forward) reachable 
from u. In other words, whether v is (forward) reachable from u depends on E2. 

If E2 is a choice subexpression, since u and v are covered by the two 

20 subexpressions of E2, respectively, u cannot reach v, which is a contradiction. In the 
case that E2 is a sequence subexpression where E2= (E3 5 E4), there are two cases: If u € 
E4 and v C E3, then u cannot reach v, which is a contradiction. If u G E3 and v C E4, 
then u can reach v through forward edges by applying Lemma 2 to both E3 and E4, 
which is also a contradiction. We have derived contradictions in all cases. Thus there 

25 must exist an iteration subexpression El * including both u and v. From Lemma 2, there 
exists w C last(El) such that w C f-reachable(u), and z C first(El) such that v C 
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f-reachable(z). Since (w 5 z) is an iteration edge, a path P connecting u to v is found. 

To prove that P is acyclic, assume to the contrary that P visits a state p more than 
once. Since the subpaths connecting u to w and z to v, respectively, are forward paths, p 
appears on each subpath only once. Now the two subpaths connecting u to p and p to v 
form a forward path connecting u to v. This contradicts the assumption v e 
f-reachable(u). Thus P is acyclic. 
Policies for Finding Candidate States 

To find out possible paths between two elements in a DTD-compliant document, 
the XML editor of this invention provides two policies, as follows: 

Policy FindCandidateStatesl: 

Given two states u and v where v C reachable(u) and (u,v) g H (H is the edge set of 
G), FindCandidateStatesl computes the candidate state set C for possible elements 
inserted between u and v. C is composed of the intermediate states in the acyclic paths 
from u to v determined by Lemma 4. (If the user wants a path with cycles, an acyclic 
path may be established first followed by cycles as Policy FindCandidateStates2 
supports.) 

Policy FindCandidateStatesl (u,v) // (u,v) g H 
IF v C f-reachable(u) THEN 

C= {x C Z | x C f-reachable(u) and v C f-reachable(x)} 
ELSE 

let El* be the smallest iteration subexpression of E that covers both u and v 
C= {x C A(E1) I x C f-reachable(u) or v C f-reachable(x)} 
ENDIF 

Policy FindCandidateStatesl allows the user to add a minimal set of elements 
between u and v in order to render the current document valid locally. Now refer to Fig. 
2 again. The state pair (a,e) satisfies e C f-reachable(a). Thus we have C= {b,c}. For the 
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state pair (c,a), a is not reachable from c through forward edges, c must reach a through 
d or e. Thus, we have C= {d,e}. 
Policy FindCandidateStates2: 

Given two states u and v where v G reachable(u) and (u,v) G H, 
5 FindCandidateStates2 computes a candidate state set C. Here (u,v) can be a forward 
edge, an iteration edge or both. If (u,v) is a forward edge, C is first computed as in 
FindCandidateStatesl. On the other hand, if u is the end or v is the beginning of an 
iteration or (u,v) is a backward edge, a new iteration can be inserted between u and v by 
adding its symbols to C. 
10 Policy FindCandidateStates2(u 5 v) // (u,v) G H 
IF (u ? v) is a forward edge THEN 

C= {x G 1 1 x G f-reachable(u) and v G f-reachable(x)} 

IF u G last(El*) for some iteration subexpression El* of E, 



and let El be the largest one, THEN 



15 



Cl= {x G A(E1) | v G f-reachable(x)} 



C-CUC1 



ENDIF 



IF v G first(E2*) for some iteration subexpression E2* of E, 



and let E2 be the largest one, THEN 



20 



C2= {x G A(E2) | x G f-reachable(u)} 



C= C U C2 



ENDIF 



ELSE /* (u,v) is a backward edge */ 



let E3* be the largest iteration subexpression of E 



25 



satisfying u G last(E3) and v G first(E3) 



C= A(E3) 
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ENDIF 

Consider the Glushkov automaton G in Figure 2. For the state pair (a,b), we have 
C= { }, which indicates no element should be inserted between a and b. For the state 
pair (b,d), we have C= {c}. For the state pair (b,c), since c G first(c*), we have C= {c}, 
5 which allows the user to iterate c. For the state pair (c,c), which is a backward edge, we 
have C= {c}. The user can add an iteration c between the two iterations. For the state 
pair (ej), since e C last(e*) and e G last(E) while E is the outer iteration, we have C= 
{a,b,c,d,e}. The user may want to add the inner iteration or the outer iteration. The 
system provides the user with all possibilities. 

1 0 With the candidate state set C cz S calculated as above, the system computes the 

candidate element list C\ where C'= {y G 2' | y= origin(x) for some x G C}, for the user 
to select from. When the user selects an element y from C\ y is mapped back to an 
element occurrence x G C for insertion into the current document. 
Hints for "Required Elements" and "Required Element Slots" 

15 In order to minimize the user's workload of element additions, the XML editor of 

this invention generates hints of "required elements" automatically upon the addition of 
an element. An element is deemed "required" if it is present in all valid documents 
containing the current document. This policy certainly applies to the empty document 
at initial stage, too. 

20 Consider the regular expression (a, (b, c)?). At initial stage, a is a required element. 

Once element b (or c) is entered, c (or b) becomes a required element. 

Since a required element is one that is present in all valid documents containing the 

current document, a required element between u and v corresponds to a state z through 

which all paths from u to v pass. Such a state is known as a cut-vertex or articulation 
25 point. It is possible to apply a maximum flow-like operation to find the articulation 

points separating u and v, i.e. to find the required elements between u and v. When there 
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are multiple articulation points, the maximum flow-like operation is executed 
repeatedly. 

The XML editor of this invention generates not only hints for elements but also 
hints for "element slots" for the user to fill in. Given two consecutive elements in the 
5 current document, if (u,v) g H where H is the edge set of G, then every valid document 
containing the current document contains at least one element between the two 
consecutive elements, a hint of element slot between the two elements is automatically 
generated. Such element slots may be called as "required element slots" since the user 
should typically provide appropriate elements for these placeholders in order to render 
10 the current document valid. While generating an element slot, the editor also generates 
an associated candidate state set C using Policy FindCandidateStatesl. The user must 
fill in an element slot by selecting from its candidate element list C' a desired element. 

Consider an element with its content defined by regular expression (a, (b \ c)). 
While the system generates the required element a, it also generates a required element 
1 5 slot following a, together with a candidate element list containing b and c for the user to 
select. 

It should be noted that the XML editor of this invention generates hints for 
required element slots only when the current document is not yet valid. When the user 
has filled up all required element slots and the system generates no more required 

20 element slots, the current document becomes "valid". 
"Optional Element Slots" 

In additional to generating hints for required elements and required element slots 
automatically, the XML editor of this invention generates hints for "optional elements" 
or "optional element slots" upon the user's requests. According to the XML editor of 

25 this invention, when a user wishes to add an element spontaneously, it is not necessary 
to point to the exact position for insertion. The user selects an existing element as a 
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reference position (e.g. highlighted) and asks the system to generate hints for all 
possible elements at positions close to the reference position. These hints are referred to 
as "optional element slots" and the elements to be filled into the slots "optional 
elements". Given two states u and v where v G reachable(u) and (u,v) G H, the editor 
5 computes the candidate state set C using Policy FindCandidateStates2, and generates an 
optional element slot between u and v if C is not empty. The user can then fill in 
optional element slots in the same way as in the required element slots. 

Consider the regular expression (a,b)+. Initially, the system generates a and b as 
required elements, which results in the state sequence sabf. Applying Policy 

10 FindCandidateStates2 to each pair of consecutive states, the resulting candidate state 
sets are: C= { } for the pair (a,b) and C= {a,b} for the pairs (s,a) and (bj). Thus the 
system generates an optional element slot between s and a, and one between b and f> but 
none between a and b. 

It should be noted that element slots are placeholders for elements rather than real 

15 elements. Conceptually they are entities displayed on the user interface rather than 
something in the working document. The XML editor of this invention can remove 
optional element slots from the user interface upon the user's request. 
Form-Based User Interface 

The DTD-compliant XML document editor of this invention can be considered as 

20 a user-interface-neutral process layer below the user interface layer. Different types of 
user interfaces can be implemented on top of this process layer. The invented XML 
editor appears suitable especially for supporting user interfaces based on tree views, 
presentation views and forms. A form-base user interface according to the XML 
document editor of this invention is illustrated as the embodiment of this invention, 

25 without limiting the scope of this invention, as follows. 

In this embodiment, the CDA DTD is taken as example to demonstrate the 



16 

form-based user interface of the DTD-compliant XML document editor of this 
invention. Figure 3 shows the top-level form when an empty document is created. This 
form displays two levels of elements: levelone has two required child elements 
clinical ^document Jieader and body; clinical document header has 5 required child 
5 elements, body contains a required element slot as its child for the user to fill in. These 
hints of required elements and required element slots are generated by the system 
automatically. The system has a parameter LEVEL_LIMIT that determines the number 
of levels of elements a form may contain. An element at the bottom level in a form 
appears as a hyperlink if it may have child elements. One can click it to display a child 

10 form that shows its child (and grandchild, etc.) elements. A child form may have child 
forms again so that child forms may nest indefinitely. 

Note that if LEVELLIMIT is large, the form has essentially a tree view. It is 
believed that tree views with deep structures are hard conceptually and confusing 
visually for ordinary users. Thus LEVEL LIMIT is typically set to a small number so 

1 5 that the form appears more like a conventional form than a tree view. Of course, anyone 
skilled in the art may adjust the depth of nested forms by setting LEVEL LIMIT to an 
appropriate value. This is important for a form-based XML editor, which lacks a global 
structural picture of the XML document. 

As shown in Figure 3, each form has two buttons for displaying and hiding all 

20 optional element slots, respectively. An alternative way is to display optional element 
slots only around a selected element. When one moves the pointer over an element, a 
small OPTIONAL icon pops up following the element in the same row as shown in the 
figure. If the user clicks the icon, the optional element slots around the current element 
are displayed as shown in Figure 4. Here the bound of neighbors is 2. Thus at most 2 

25 optional element slots are displayed preceding (following) the current element. On the 
other hand, if the user does not click OPTIONAL and moves the pointer out of the 



current element, the OPTIONAL icon pops off immediately. 

Also shown in Figure 4, the user can fill in an element slot by selecting from a 
menu of candidate elements generated by the system. Suppose that, in this case, the user 
has selected the element section. By clicking section to display a new child form, the 
5 user is allowed to work on section's child and grandchild elements. Figure 5 shows a 
scenario where the user has added a few attributes and child elements to the section 
element. Attributes and child elements form two groups with attributes preceding child 
elements. 

As shown in Figure 5, when the user moves the pointer over an element or attribute, 
10 three small icons, OPTIONAL, DELETE and CHANGE, may pop up following the 
element or attribute in the same row. By clicking DELETE (CHANGE), the user can 
remove the current element or attribute (change the tag of the current element). 

The user interface of the DTD-compliant editor of this invention supports a 
level-limited tree view so as to represent the flexible and dynamic structure of XML 
1 5 documents. It frees the user from any syntactic concerns and generates element slots for 
the user to fill in. Such a form-based XML editor can also serve as a form generator 
based on an XML data model. 

It should be noted that the DTD-compliant XML document editor of this invention 
is user interface-neutral, being able to support document structure-independent layouts 
20 as well. 

Effects of the Invention 
The XML editor of this invention allows the user to create and edit 
DTD-compliant XML documents, while syntactic violations are avoided. The XML 
editor generates hints for required elements and required element slots automatically in 
25 the creation and edition of documents so as to guide the user to produce valid 
documents. The editor also generates optional element slots upon the user's request. As 



IS 

a consequence, the user's major work is simply to fill in element slots by selecting 
candidate elements from system-generated a list. The user requires no knowledge about 
XML and DTD to edit DTD-compliant XML documents. 

The invented XML document editor is user-interface-neutral, being able to support 
5 user interfaces based on tree views, presentation views and forms. In particular, by 
combining the DTD-compliant process with the simplicity of forms, a simple XML 
editor with forms as its user interface is provided. Such a form-based XML editor can 
also serve as a form generator based on an XML data model. 

Using the DTD-compliant XML editor of this invention, a user's action is in 
10 principle under the control of the system but the system does not over-constrain the user. 
The user can select any element from a candidate element list. Elements do not have to 
be added in a certain order. DTD-compliance is not a strict constraint indeed. On the 
other hand, since the user's action is under the control of the system, a user can never 
violate the syntactic constraints. Consequently users don't have to be aware of the 
15 syntax of the document. 

As the present invention has been shown and described with reference to preferred 
embodiments thereof, those skilled in the art will recognize that the above and other 
changes may be made therein without departing form the spirit and scope of the 
invention. 

20 



